New Uses For FSAVE: FXSAVE
roy g biv / defjam
New Uses For FSAVE roy g biv / defjam -= defjam =- since 1992 bringing you the viruses of tomorrow today! Former DOS/Win16 virus writer, author of several virus families, including Ginger (see Coderz #1 zine for terrible buggy example, contact me for better sources ;), and Virus Bulletin 9/95 for a description of what they called Rainbow. Co-author of world's first virus using circular partition trick (Orsam, coded with Prototype in 1993). Designer of world's first XMS swapping virus (John Galt, coded by RT Fishel in 1995, only 30 bytes stub, the rest is swapped out). Author of world's first virus using Thread Local Storage for replication (Shrug, see Virus Bulletin 6/02 for a description, but they call it Chiton), world's first virus using Visual Basic 5/6 language extensions for replication (OU812), world's first Native executable virus (Chthon), world's first virus using process co-operation to prevent termination (Gemini, see Virus Bulletin 9/02 for a description), world's first virus using polymorphic SMTP headers (JunkMail, see Virus Bulletin 11/02 for a description), world's first viruses that can convert any data files to infectable objects (Pretext), world's first 32/64-bit parasitic EPO .NET virus (Croissant, see Virus Bulletin 11/04 for a description, but they call it Impanate), world's first virus using self-executing HTML (JunkHTMaiL, see Virus Bulletin 7/03 for a description), world's first virus for Win64 on Intel Itanium (Shrug, see Virus Bulletin 6/04 for a description, but they call it Rugrat), world's first virus for Win64 on AMD AMD64 (Shrug), world's first cross-infecting virus for Intel IA32 and AMD AMD64 (Shrug), world's first viruses that infect Office applications and script files using the same code (Macaroni, see Virus Bulletin 11/05 for a description, but they call it Macar), world's first viruses that can infect both VBS and JScript using the same code (ACDC, see Virus Bulletin 11/05 for a description, but they call it Cada), world's first virus that can infect CHM files (Charm, see Virus Bulletin 10/06 for a description, but they call it Chamb), world's first IDA plugin virus (Hidan, see Virus Bulletin 3/07 for a description), world's first viruses that use the Microsoft Script Encoder to dynamically encrypt the virus body (Screed), world's first virus for StarOffice and OpenOffice (Starbucks), world's first virus IDC virus (ID10TiC), world's first polymorphic virus for Win64 on AMD AMD64 (Boundary, see Virus Bulletin 12/06 for a description, but they call it Bounds), world's first virus that can infect Intel-format and PowerPC-format Mach-O files (MachoMan, see Virus Bulletin 01/07 for a description, but they call it Macarena), world's first virus that uses Unicode escapes to dynamically encrypt the virus body, world's first self-executing PIF (Spiffy), world's first self-executing LNK (WeakLNK), world's first virus that uses virtual code (Relock), and world's first virus to use FSAVE for instruction reordering (Mimix). Author of various retrovirus articles (eg see Vlad #7 for the strings that make your code invisible to TBScan). This is my seventeenth virus for Win32. It is sad that this intro is longer than the text. FXSAVE Last time we talked about MMX for instruction reordering. It is great because we have 64-bit registers to use and there are no restrictions on the order of instructions or the arithmetic operations. Now we take it even further: the XMM registers. XMM registers require the FXSAVE instruction, instead of FSAVE for FPU and MMX. This is because the XMM registers do not use the FPU stack, so you can mix FPU/MMX and XMM in the same code with no problems. When FXSAVE is executed, it will store the state of the FPU and the XMM to the specified memory location. The format of the state is known. It looks like this: Offset Size Name Value 0x000 0x02 Control variable 0x002 0x02 Status 0x0000 0x004 0x02 Tag 0x0000 because we fill all registers 0x006 0x02 Opcode variable (instruction & 0x03ff) 0x00a 0x04 LastEip variable 0x00c 0x02 LastCS variable 0x00e 0x02 Filler 0x0000 0x010 0x04 LastData variable 0x014 0x02 LastDS variable 0x016 0x02 Filler 0x0000 0x018 0x04 MXCSR variable 0x01c 0x04 MXCSR_MASK variable 0x020 0x0a st0/mm0 user-defined 0x02a 0x06 pad 0x000000000000 0x030 0x0a st1/mm1 user-defined 0x03a 0x06 pad 0x000000000000 0x040 0x0a st2/mm2 user-defined 0x04a 0x06 pad 0x000000000000 0x050 0x0a st3/mm3 user-defined 0x05a 0x06 pad 0x000000000000 0x060 0x0a st4/mm4 user-defined 0x06a 0x06 pad 0x000000000000 0x070 0x0a st5/mm5 user-defined 0x07a 0x06 pad 0x000000000000 0x080 0x0a st6/mm6 user-defined 0x08a 0x06 pad 0x000000000000 0x090 0x0a st7/mm7 user-defined 0x09a 0x06 pad 0x000000000000 0x0a0 0x10 xmm0 user-defined 0x0b0 0x10 xmm1 user-defined 0x0c0 0x10 xmm2 user-defined 0x0d0 0x10 xmm3 user-defined 0x0e0 0x10 xmm4 user-defined 0x0f0 0x10 xmm5 user-defined 0x100 0x10 xmm6 user-defined 0x110 0x10 xmm7 user-defined 0x120 0x10 filler 0x00000000000000000000000000000000 0x130 0x10 filler 0x00000000000000000000000000000000 0x140 0x10 filler 0x00000000000000000000000000000000 0x150 0x10 filler 0x00000000000000000000000000000000 0x160 0x10 filler 0x00000000000000000000000000000000 0x170 0x10 filler 0x00000000000000000000000000000000 0x180 0x10 filler 0x00000000000000000000000000000000 0x190 0x10 filler 0x00000000000000000000000000000000 0x1a0 0x10 filler 0x00000000000000000000000000000000 0x1b0 0x10 filler 0x00000000000000000000000000000000 0x1c0 0x10 filler 0x00000000000000000000000000000000 0x1d0 0x10 filler 0x00000000000000000000000000000000 0x1e0 0x10 filler 0x00000000000000000000000000000000 0x1f0 0x10 filler 0x00000000000000000000000000000000 Since the xmm array is at the end and we can control it completely, I got the idea to put instructions into it. Then if I use FXSAVE at eip-0x99, and add a jump to skip the reserved part, so xmm0 will be the next instruction to execute. It also means that I can write 0x80 bytes of code to memory using one instruction! No other instruction can do the same. XMM has the same advantage as MMX, which is that the registers can be read in any order. There are no restriction about the address of the values. Just remember that the address for FXSAVE must be 16-bytes aligned, else a fault will occur. load: pop eax ;can be any register movdqu xmm0, oword ptr [eax + x0] movdqu xmm1, oword ptr [eax + x1] movdqu xmm2, oword ptr [eax + x2] movdqu xmm3, oword ptr [eax + x3] movdqu xmm4, oword ptr [eax + x4] movdqu xmm5, oword ptr [eax + x5] movdqu xmm6, oword ptr [eax + x6] movdqu xmm7, oword ptr [eax + x7] ;can be in any order jmp $ + 70 ;to fxsave [here can be a decryptor for XMM registers] fxsave byte ptr [eax - 99] ;overwrite call with xmm start: call load [here can be values to load into xmm array] We do more, though. Since XMM registers can hold any value, we can perform some operations on the registers, in the same way that we can do it for CPU registers. Those operations can be the simple ones: add (paddq), sub (psubq), xor (pxor), but it is enough. A 0x80 bytes array might not sound very large, but we can put a nice decryptor there, and make another one using the XMM registers. Since probably no AV emulators support XMM at all, it doesn't even need to be very complex if they can't run it. Greets to friendly people (A-Z): Active - Benny - izee - Malum - Obleak - Prototype - Ratter - Ronin - RT Fishel - sars - SPTH - The Gingerbread Man - Ultras - uNdErX - Vallez - Vecna - VirusBuster - Whitehead rgb/defjam mar 2008 iam_rgb@hotmail.com